Language modeling for sentence retrieval: A comparison between Multiple-Bernoulli models and Multinomial models

Author

  • David E. Losada
Abstract

In this work we focus on a sentence retrieval task to compare Language Modeling based on a multi-variate Bernoulli distribution with Language Modeling based on the popular multinomial models. Viewing text generation as a multiple-Bernoulli process is currently not the predominant approach in Language Modeling for Information Retrieval, but we show that the characteristics of the task make it well suited to statistical Language Modeling based on a multi-variate Bernoulli distribution.
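As a rough illustration of the two views of text generation compared here, the following Python sketch scores a sentence against a query under each model. The function names, the linear-interpolation smoothing with weight `lam`, and the toy probability estimates are assumptions made for illustration only; they are not the estimators used in the paper.

```python
import math
from collections import Counter


def multinomial_log_likelihood(query, sentence, collection, lam=0.5):
    """log P(query | sentence), treating the query as a sequence of term draws."""
    coll_counts = Counter(t for s in collection for t in s)
    coll_len = sum(coll_counts.values())
    sent_counts = Counter(sentence)
    score = 0.0
    for term in query:
        if coll_counts[term] == 0:
            continue  # assumption: ignore query terms unseen in the collection
        p_sent = sent_counts[term] / len(sentence) if sentence else 0.0
        p_coll = coll_counts[term] / coll_len
        # Interpolate sentence and collection estimates (0 < lam < 1).
        score += math.log((1 - lam) * p_sent + lam * p_coll)
    return score


def bernoulli_log_likelihood(query, sentence, collection, lam=0.5):
    """log P(query | sentence), treating every vocabulary term as an
    independent present/absent event (term frequency is ignored)."""
    n = len(collection)
    doc_freq = Counter(t for s in collection for t in set(s))
    query_terms, sent_terms = set(query), set(sentence)
    score = 0.0
    for term, df in doc_freq.items():
        p_coll = (df + 0.5) / (n + 1)  # kept strictly inside (0, 1)
        p = (1 - lam) * (term in sent_terms) + lam * p_coll
        # Terms absent from the query also contribute, through (1 - p).
        score += math.log(p if term in query_terms else 1 - p)
    return score


# Toy data, purely hypothetical.
collection = [
    ["language", "model", "retrieval"],
    ["sentence", "retrieval", "task"],
    ["bernoulli", "distribution"],
]
query = ["sentence", "retrieval"]
for sentence in collection:
    print(sentence,
          multinomial_log_likelihood(query, sentence, collection),
          bernoulli_log_likelihood(query, sentence, collection))
```

The structural difference is that the multinomial score only sums over the terms that occur in the query, whereas the multiple-Bernoulli score ranges over the whole vocabulary and also accounts for the terms the query does not contain.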


Related resources

ZIP and data document visualization

Text data modeling has usually been approached with Bernoulli or multinomial event models. The Poisson distribution is considered inefficient for text information retrieval. In this work, we propose to incorporate the Zero Inflated Poisson model in the Generative Topographic Mapping algorithm. The modified algorithm is presented as a text document cluster extraction and visualization tool. Experime...


A Comparison of Event Models for Naive Bayes Text Classification

Recent approaches to text classification have used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-g...



Relevance Feedback Models for Recommendation

We extended language modeling approaches in information retrieval (IR) to combine collaborative filtering (CF) and content-based filtering (CBF). Our approach is based on the analogy between IR and CF, especially between CF and relevance feedback (RF). Both CF and RF exploit users’ preference/relevance judgments to recommend items. We first introduce a multinomial model that combines CF and CBF...


An Efficient Computation of the Multiple-Bernoulli Language Model

The Multiple Bernoulli (MB) Language Model has been generally considered too computationally expensive for practical purposes and superseded by the more efficient multinomial approach. While the model has many attractive properties, little is actually known about the retrieval effectiveness of the MB model due to its high cost of execution. In this paper, we show how an efficient implementatio...




Publication date: 2005